Tidy First?
Tags: #technology #software engineering #design #programming #refactoring #agile
Authors: Kent Beck
Overview
My book, Tidy First?, explores the question that has perplexed many programmers: when faced with messy code that needs to change, do you tidy the code first, or make the change and tidy later (or never)? Drawing upon my decades of experience in software design, I offer practical guidance and a theoretical framework to help you navigate this decision. Tidy First? is written primarily for programmers, lead developers, and those who work hands-on with code, emphasizing an approach I call “empirical software design.” This approach rejects the extremes of speculative design, where excessive upfront design anticipates all future needs, and reactive design, where design is only considered when features become nearly impossible to add. Instead, empirical software design bases design decisions on observations of what makes changes hard, applying just enough design to ease the pain of the next behavior change.
The book introduces the concept of “tidyings,” which are tiny, incremental design improvements that make code easier to understand and change. It dives into specific tidyings, such as using guard clauses, removing dead code, normalizing symmetries, and creating helper functions, illustrating each with code examples. I also emphasize the importance of working in small steps, creating separate pull requests for tidyings, and understanding how tidyings can chain together to create a cascade of improvements.
To help readers make informed decisions about when to tidy, I delve into the economics of software design. I introduce the ideas of the time value of money and optionality, showing how these concepts affect the timing of design decisions. I explain how tidying can reduce coupling, which is the interconnectedness of elements in a system. Excessive coupling makes changes expensive and risky. By reducing coupling, you make the system more flexible and easier to evolve.
I also discuss cohesion, the principle that related elements should be grouped together. By improving cohesion, you make the system easier to understand and modify. While this first book in the Empirical Software Design series focuses on the individual programmer’s relationship with their own code, it sets the stage for exploring broader applications of these ideas in the context of teams and organizations. Ultimately, my goal is to help you make software design an ordinary, balanced part of your daily development practice, leading to more enjoyable, efficient, and valuable software.
Book Outline
0. Introduction
This book introduces the concept of Tidy First, a software design approach that encourages small, incremental improvements to messy code before making significant behavioral changes. It emphasizes caring for ourselves as developers by making our work more manageable and focusing on creating a positive coding experience.
Key concept: Software design is an exercise in human relationships. Why don’t we take time to care for ourselves? Take time to make our work easier? Why do we go down the rabbit hole of cleaning code to the exclusion of work that would help our users?
1. Guard Clauses
Guard clauses help make preconditions for a code block explicit. They simplify complex nested conditions and enhance readability. However, avoid overusing guard clauses as too many can hinder comprehension.
Key concept: Only tidy to a guard clause when the code has precisely this shape:
if (condition) …all the rest of the code in the routine…
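A minimal sketch of this tidying in Python. The shipping_cost functions, the order layout, and the 1.5 rate are hypothetical names and values chosen for illustration, not taken from the book. In the first version the whole routine sits inside one condition; in the second, a guard clause states the precondition up front and the rest of the code loses a level of nesting.

```python
def shipping_cost_nested(order):
    # Before: the entire routine lives inside a single condition.
    if order is not None:
        weight = sum(item["weight"] for item in order["items"])
        return round(weight * 1.5, 2)
    return 0.0


def shipping_cost(order):
    # After: the guard clause makes the precondition explicit and exits early.
    if order is None:
        return 0.0

    weight = sum(item["weight"] for item in order["items"])
    return round(weight * 1.5, 2)
```

Both functions return the same results, which is the point: the tidying changes structure, not behavior.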
2. Dead Code
Dead code, code that never gets executed, should be removed. Version control systems allow you to retrieve it if necessary. Deleting dead code enhances readability and reduces unnecessary cognitive load.
Key concept: “A little” is a cognitive measure, not a lines-of-code measure.
3. Normalize Symmetries
Normalize Symmetries by making identical code look identical and different code look different. This involves standardizing solutions to recurring problems like lazy initialization. Consistent code improves readability and understanding.
Key concept: Pick a way. Convert one of the variants into that way. Tidy one form of unnecessary variation at a time - lazy initialization, for example, first.
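As a rough illustration, assume a hypothetical Report class with two lazily initialized fields written in two different styles. The sketch below picks one form (check for None, assign, return) and converts the other variant to it. The class and field names are invented for the example.

```python
class ReportBefore:
    """Two lazy initializations, written two different ways."""

    def __init__(self):
        self._header = None
        self._footer = None

    def header(self):
        if self._header is None:
            self._header = "Quarterly Report"
        return self._header

    def footer(self):
        # Same idea, different shape: the reader has to decode it again.
        self._footer = self._footer or "Confidential"
        return self._footer


class Report:
    """After tidying: both accessors follow the one chosen form."""

    def __init__(self):
        self._header = None
        self._footer = None

    def header(self):
        if self._header is None:
            self._header = "Quarterly Report"
        return self._header

    def footer(self):
        if self._footer is None:
            self._footer = "Confidential"
        return self._footer
```

Once the two accessors look identical, any remaining difference between them is a real difference worth noticing.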
4. New Interface, Old Implementation
When an existing interface is cumbersome, create a new interface that simplifies the interaction. Implement this new interface by calling the old one. Later, once all callers are using the new interface, you can refactor the underlying implementation.
Key concept: Creating a pass-through interface is the micro-scale essence of software design.
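One way to picture a pass-through interface, assuming a hypothetical find_users function whose long argument list makes every call awkward. The new function, newest_active_users, is the call we wish we had; for now it only forwards to the old one. All names and the in-memory db layout are assumptions made for the sketch.

```python
def find_users(db, table, filters, sort_key, descending, limit):
    """Old interface (hypothetical): flexible, but cumbersome at every call site."""
    rows = [
        row for row in db[table]
        if all(row.get(key) == value for key, value in filters.items())
    ]
    rows.sort(key=lambda row: row[sort_key], reverse=descending)
    return rows[:limit]


def newest_active_users(db, limit=10):
    """New interface: states the intent and passes through to the old implementation."""
    return find_users(db, "users", {"active": True}, "joined", True, limit)
```

Callers can migrate to newest_active_users one at a time. When none of them call find_users directly any more, the implementation behind the new interface is free to change.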
5. Reading Order
Arrange code in a reading order that makes it easy for others (and your future self) to understand. Consider what details and context are essential for comprehension and present them in a logical sequence.
Key concept: No single ordering of elements is perfect. Sometimes you want to understand the primitives first and then understand how they compose. Sometimes you want to understand the API first and then understand the details of implementation.
6. Cohesion Order
Improve cohesion order by placing related code elements close to each other. While eliminating coupling is ideal, it may not always be feasible. Sometimes, reordering code to improve cohesion is enough to make future changes easier.
Key concept: cost(decoupling) + cost(change) < cost(coupling) + cost(change)
7. Move Declaration and Initialization Together
Enhance readability by moving declarations and initializations of variables close together. This reinforces the understanding of the variable’s purpose and role in the code. Consider data dependencies when reordering code and work in small steps.
Key concept: Play around with the order. Is it easier to read and understand the code if each of the variables is declared and initialized just before it’s used, or if they’re all declared and initialized together at the top of the function?
8. Explaining Variables
Introduce explaining variables to simplify complex expressions. Extract parts of the expression into well-named variables that reflect their intention. This improves readability and makes it easier to modify individual parts of the expression later.
Key concept: As always, separate the tidying commit from the behavior change commit.
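A small sketch of the idea, using a distance calculation as a stand-in example (the function names are hypothetical). The subexpressions are pulled into variables named for what they mean, so the final expression reads at a glance.

```python
import math


def distance_before(p, q):
    # Before: one dense expression the reader must unpack.
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def distance(p, q):
    # After: each explaining variable names the intent of a subexpression.
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return math.sqrt(dx ** 2 + dy ** 2)
```

If the calculation of dx later needs to change, the edit now touches one short line instead of the middle of a long expression.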
9. Explaining Constants
Replace magic numbers and repeated strings with explaining constants. This makes the code more self-documenting and easier to understand. Be careful about identical literals having different meanings in different contexts.
Key concept: You’re reading. You understand. You’re putting that understanding into the code so you don’t have to hold it in your head.
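For instance, a retry check with two magic numbers might be tidied as follows. The constant names and the should_retry function are assumptions for this sketch.

```python
HTTP_NOT_FOUND = 404
MAX_ATTEMPTS = 3


def should_retry(status_code, attempts):
    # Before: `status_code != 404 and attempts < 3`
    # After: the constants carry the meaning the reader had to work out.
    return status_code != HTTP_NOT_FOUND and attempts < MAX_ATTEMPTS
```

A literal 3 elsewhere in the code that means something else, such as a number of color channels, should get its own constant rather than reuse MAX_ATTEMPTS.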
10. Explicit Parameters
Use explicit parameters instead of relying on implicit data sources like maps or environment variables. This makes the inputs to a function clear and enhances testability and understandability.
Key concept: This will make your code easier to read, test, and analyze.
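A possible shape for this tidying, assuming a hypothetical routine that reads its settings from environment variables. The routine is split in two: a thin front half gathers the implicit inputs, and the second half receives them as explicit parameters. The variable names DB_HOST and DB_PORT and the returned string are illustrative only.

```python
import os


def connection_string_from_env():
    # Front half: gathers the implicit inputs, then passes them explicitly.
    host = os.environ.get("DB_HOST", "localhost")
    port = int(os.environ.get("DB_PORT", "5432"))
    return connection_string(host, port)


def connection_string(host, port):
    # Second half: every input is visible in the signature,
    # so a test can call it without touching the environment.
    return f"{host}:{port}"
```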
11. Chunk Statements
Chunk statements into logical groups by adding blank lines between them. This simple act of tidying improves readability and allows you to identify patterns and potential areas for further improvement.
Key concept: Done well, software design enables software design that enables change.
12. Extract Helper
Extract blocks of code with a clear purpose into helper functions. Name the function after its intent, not how it achieves it. Helper functions promote reusability and enhance readability.
Key concept: Fondness is not the only reason to keep helpers around. Frequently you’ll find yourself wanting to use your new helper again hours or even minutes after you’ve created it.
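A brief sketch, assuming a hypothetical invoice calculation. The loop that computes the subtotal has a clear purpose, so it moves into a helper named for that purpose rather than for its mechanics.

```python
def invoice_total_before(lines, tax_rate):
    subtotal = 0
    for line in lines:
        subtotal += line["price"] * line["quantity"]
    return subtotal * (1 + tax_rate)


def subtotal_of(lines):
    # Named for its intent (the subtotal), not for how it loops and sums.
    return sum(line["price"] * line["quantity"] for line in lines)


def invoice_total(lines, tax_rate):
    return subtotal_of(lines) * (1 + tax_rate)
```

The helper is immediately reusable, for example by a report that needs the subtotal without tax.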
13. One Pile
Sometimes over-decomposition of code hinders comprehension. In such cases, create One Pile by inlining small pieces back into a larger function. This can reveal new patterns and simplify subsequent refactoring.
Key concept: Given the bias toward more, smaller pieces, creating one pile feels odd while tidying. However, it’s strangely satisfying.
14. Explaining Comments
Use explaining comments to capture insights that aren’t immediately obvious from the code. Avoid redundant comments that merely repeat what the code says.
Key concept: Immediately upon finding a defect is a good time to comment.
15. Delete Redundant Comments
Delete redundant comments that no longer add value. Tidying can make comments redundant by making the code itself clearer.
Key concept: Tidyings often chain together. A previous tidying may have made a comment redundant.
16. Separate Tidying
Keep tidying changes separate from behavior changes. Create separate pull requests (PRs) for tidyings and ensure that each PR contains only a few tightly related tidyings. This helps reviewers understand and approve changes quickly.
Key concept: The tidyings have to go somewhere, or you don’t tidy. Where do they go? Summary: they go in their own PRs, with as few tidyings per PR as possible.
17. Chaining
Each tidying can pave the way for further tidyings, creating a chaining effect. For example, a guard clause can lead to introducing explaining variables or extracting helpers.
Key concept: Tidying becomes a game of chess, with moves visible ahead. Each tidying sets up further tidyings.
18. Batch Sizes
The ideal batch size for tidyings depends on several factors like the amount of tidying needed, the risk of collisions, and the cost of review. Aim for small batches, but adjust based on your context and team’s practices.
Key concept: Figure 18-5: Reduce the cost of review to reduce the cost of tidying by shrinking batches
19. Rhythm
Tidying should be done in a rhythmic fashion, aiming for minutes to an hour of tidying before making behavior changes. Behavior changes tend to cluster in code, and so do tidyings. This creates a natural flow where tidying supports future changes.
Key concept: Behavior changes tend to cluster in the code. From Pareto, 80% of the changes will occur in 20% of the files.
20. Getting Untangled
When you find yourself tangled in a mix of tidyings and behavior changes, consider restarting and tidying first. While discarding work can be difficult, it often leads to a cleaner and more understandable codebase in the long run.
Key concept: The sunk cost fallacy complicates the choice between these options.
21. First, After, Later, Never
Tidy First? emphasizes tidying code before making behavioral changes, but it acknowledges that it’s not always the optimal approach. Sometimes it’s better to tidy after, later, or never. The key is to be aware of the tradeoffs and make conscious decisions based on the context.
Key concept: “How would we work if we had enough time?”
22. Beneficially Relating Elements
Software design is about beneficially relating elements. This involves understanding the elements of a system, their relationships (like invokes, publishes, listens, refers), and the benefits derived from those relationships.
Key concept: “Software design”: beneficially relating elements
23. Structure and Behavior
Software creates value through its behavior today and its potential for future changes. Behavior can be described in terms of input/output pairs and invariants. Software design aims to enable change and preserve these behavioral aspects.
Key concept: Input/output pairs: this many hours at this pay rate in this jurisdiction should result in a paycheck like this and a tax filing like that. Invariants: the sum of all entitlements should equal the sum of all deductions.
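The two ways of describing behavior can be sketched as executable checks. The paycheck function below is hypothetical, works in integer cents to avoid rounding noise, and uses a simpler invariant than the book's (net plus tax equals gross) purely as a stand-in.

```python
def paycheck(hours, rate_cents, tax_percent):
    """Hypothetical payroll rule: flat hourly pay and a flat tax percentage."""
    gross = hours * rate_cents
    tax = gross * tax_percent // 100
    return {"gross": gross, "tax": tax, "net": gross - tax}


def holds_invariant(check):
    # Invariant: whatever the inputs, net pay plus tax accounts for all of gross pay.
    return check["net"] + check["tax"] == check["gross"]


# Input/output pair: these specific inputs should produce this specific paycheck.
example = paycheck(hours=40, rate_cents=2500, tax_percent=20)
```

An input/output pair pins down one case; an invariant constrains every case. A structure change should leave both kinds of check passing.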
25. A Dollar Today > A Dollar Tomorrow
The time value of money favors tidying after rather than tidying first. This means prioritizing immediate revenue-generating changes and deferring tidying when it doesn’t directly help you earn sooner.
Key concept: A dollar today is worth more than a dollar tomorrow, so earn sooner and spend later.
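The arithmetic behind this can be sketched with the standard present-value formula. The 10% discount rate and the amounts are assumed figures for illustration, not numbers from the book.

```python
def present_value(amount, annual_rate, years):
    """Value today of `amount` received `years` from now, at a fixed discount rate."""
    return amount / (1 + annual_rate) ** years


# Hypothetical figures: the same 1000 of revenue, earned now or in two years,
# discounted at an assumed 10% per year.
earn_now = present_value(1000, 0.10, 0)
earn_later = present_value(1000, 0.10, 2)   # roughly 826
```

The same reasoning applies in reverse to costs: an expense pushed into the future weighs less today, which is why the rule is to earn sooner and spend later.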
26. Options
Optionality is a valuable aspect of software design. By making the system easy to change, you create options for future behavior changes, increasing the overall value of the software. The more uncertain the future, the more valuable options become.
Key concept: “What behavior can I implement next?” has value all on its own, even before I implement it.
27. Options Versus Cash Flows
The decision to tidy first depends on whether the total cost of tidying and making the behavior change is less than the cost of making the change without tidying. If so, tidy first; otherwise, consider other timings.
Key concept: cost(tidying) + cost(behavior change after tidying) < cost(behavior change without tidying)
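The inequality translates directly into a small decision helper. The function name and the idea of estimating costs in hours are assumptions for this sketch; in practice the inputs are rough judgments rather than measurements.

```python
def should_tidy_first(cost_tidying, cost_change_after_tidying, cost_change_without_tidying):
    """Return True when tidying first is the cheaper route to the behavior change."""
    return cost_tidying + cost_change_after_tidying < cost_change_without_tidying


# Hypothetical estimates in hours.
clear_win = should_tidy_first(1, 2, 5)    # 1 + 2 < 5
not_worth_it = should_tidy_first(3, 2, 4)  # 3 + 2 is not < 4
```

A False result covers only the immediate change. Tidying after, later, or never remain open, and the option value of a tidier structure may still justify the work.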
28. Reversible Structure Changes
Structure changes are generally reversible, making them safer than behavior changes. Invest less in avoiding mistakes with reversible changes and focus on making them quickly.
Key concept: Because there is so little value to avoiding mistakes, we shouldn’t invest much in doing so.
29. Coupling
Coupling refers to the interconnectedness of elements in a system. Two elements are coupled with respect to a change if modifying one necessitates modifying the other. Coupling increases the cost of making changes and can lead to cascading changes.
Key concept: coupled(E1, E2, Δ) ≡ ΔE1 ⇒ ΔE2
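The definition is relative to a particular change, which a small lookup table can make concrete. The file names, change descriptions, and the RIPPLES table are all hypothetical; the sketch only records, for a given element and change, which other elements must change too.

```python
# Hypothetical record: for (element, kind of change), which other elements
# are forced to change as a consequence.
RIPPLES = {
    ("orders.py", "rename customer id"): {"billing.py", "reports.py"},
    ("orders.py", "add order note"): set(),
}


def coupled(e1, e2, change):
    """True when making `change` to e1 requires changing e2 as well."""
    return e2 in RIPPLES.get((e1, change), set())
```

Here orders.py and billing.py are coupled with respect to renaming the customer id but not with respect to adding an order note, so whether the coupling matters depends on which changes actually happen.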
30. Constantine’s Equivalence
Constantine’s Equivalence states that the cost of software is approximately equal to the cost of changing it. The most expensive changes are those that require modifying many coupled elements. Reducing coupling is essential to reduce software development cost.
Key concept: cost(software) ~= cost(change) ~= cost(big changes) ~= coupling
31. Coupling Versus Decoupling
While decoupling reduces the cost of future changes, it also has a cost. There’s a tradeoff between coupling and decoupling, and the optimal level of coupling depends on the specific context and the expected changes to the system.
Key concept: Figure 31-1. Cost of coupling trades off with cost of decoupling
32. Cohesion
Cohesion is related to coupling. Coupled elements should belong to the same containing element, while uncoupled elements should be placed elsewhere. Improving cohesion often involves extracting cohesive subelements like helper functions.
Key concept: Figure 32-1. Incohesive element improved either by (top) extracting a cohesive subelement or by (bottom) moving uncoupled subelements elsewhere
33. Conclusion
Ultimately, the decision to tidy first comes down to judgment. Consider the cost, revenue, coupling, and cohesion implications of tidying, and remember that the goal is to make software development more enjoyable and efficient for you and your colleagues.
Key concept: Coupling conducts one tidying to the next to the next.
Essential Questions
1. What is “Tidy First?” and how does it differ from other software design approaches?
Tidy First is a software design approach that prioritizes making small, incremental improvements to code structure (“tidyings”) before making behavioral changes. It suggests that by improving readability and reducing coupling, future changes become easier and less error-prone. This approach aims to make software design an ongoing and integrated part of development rather than a separate, upfront activity. It recognizes that we often discover better design as we work with the code and advocates for making design decisions based on observed needs and opportunities for simplification.
2. When should you tidy code, and what factors should you consider when deciding whether to “Tidy First?”?
The key to knowing when to tidy is to consider the economics of software development. Discounted cash flow tells us to earn sooner and spend later, discouraging upfront tidying. However, optionality suggests that spending money now (on tidying) can create more valuable options for future changes. The decision hinges on whether the total cost of tidying first is less than the cost of making the change without tidying, considering both immediate and future changes.
3. What is coupling, and how does tidying help reduce it? Why is minimizing coupling crucial in software design?
Coupling, a core concept in software design, refers to the interconnectedness of elements in a system. High coupling means changes in one element necessitate changes in many others, leading to expensive, cascading changes. Tidying, particularly through techniques like extracting helper functions and normalizing symmetries, can help reduce coupling by creating clearer boundaries between elements and reducing dependencies. Reducing coupling lowers the cost of software development by making changes easier, faster, and safer.
4. What is cohesion, and how does tidying help improve it? How does cohesion contribute to a better design?
Cohesion refers to grouping related elements together within a containing element. Coupled elements should be placed in the same module or function, while unrelated elements should be moved elsewhere. High cohesion contributes to better code organization and understanding. Tidying can improve cohesion by extracting cohesive subelements and rearranging code to group related elements, making the system more understandable and maintainable.
5. How does Kent Beck define “software design,” and what are the key elements of this definition?
Software design, according to Kent Beck, is about “beneficially relating elements.” It involves identifying the elements of the system, understanding their relationships (like invokes, publishes, listens, refers), and ensuring these relationships create benefits like reducing complexity, enhancing flexibility, and supporting future changes. This definition highlights the human aspect of software design, emphasizing the need to consider how design decisions impact developers, users, and the overall value of the software.
Key Takeaways
1. Embrace iterative and incremental design over grand, upfront design.
“Tidy First?” advocates for an iterative and incremental approach to software design, focusing on making small, focused improvements to the code structure before making significant behavioral changes. This aligns with the principles of agile development and lean methodologies, where the emphasis is on delivering value quickly and adapting to changing requirements. By tidying first, you make it easier to incorporate feedback, experiment with different solutions, and avoid the pitfalls of over-engineering.
Practical Application:
When designing a new AI model, instead of trying to build a perfect model upfront, start with a simple, working model and iteratively improve its design based on observed performance and challenges in integrating with other systems. This allows for faster feedback and avoids investing heavily in design that might not be necessary.
2. Work in small steps and keep tidying changes separate from behavioral changes.
The book strongly emphasizes the importance of working in small, controlled steps when tidying code. Creating separate pull requests for tidying changes and behavioral changes allows for focused reviews, reduces the risk of introducing errors, and ensures that everyone understands the rationale behind design decisions. It also creates a traceable history of design improvements, making it easier to understand how the code evolved over time.
Practical Application:
When working on a complex AI project involving multiple team members, establish a practice of creating separate pull requests for code tidyings and behavioral changes, so that reviewers can approve a structure-only change quickly and give their full attention to the changes that alter behavior.
3. Minimize coupling to reduce the cost and complexity of future changes.
Coupling, the interconnectedness of elements in a system, is a major driver of cost and complexity in software development. “Tidy First?” highlights various tidyings that can help reduce coupling, such as using explicit parameters, extracting helper functions, and normalizing symmetries. By minimizing coupling, you make the system more flexible, easier to change, and less prone to cascading errors when modifications are made.
Practical Application:
When developing an AI system for a rapidly evolving field like natural language processing, prioritize creating a flexible and adaptable architecture that can easily incorporate new algorithms and data sources. This involves minimizing coupling between modules and designing clear interfaces that can be extended without requiring widespread changes.
4. Strive for high cohesion to improve code organization and understandability.
Cohesion, the principle of grouping related elements together, is another crucial aspect of good software design. “Tidy First?” encourages improving cohesion by extracting cohesive subelements, moving unrelated elements elsewhere, and arranging code to cluster related functionality. High cohesion leads to more understandable and maintainable systems.
Practical Application:
When building an AI-powered recommendation system, group related components like data processing, model training, and recommendation generation into cohesive modules. This improves code organization, makes it easier for different team members to work on specific aspects of the system, and reduces the cognitive load when understanding or modifying the code.
5. Prioritize code clarity and readability to create a more positive and productive development experience.
“Tidy First?” emphasizes the human aspect of software development, recognizing that developers are not just instructing computers, but also communicating with each other through code. By prioritizing code clarity and readability, you create a more enjoyable and productive coding experience for yourself and your colleagues. This, in turn, leads to better software, faster development, and a more positive work environment.
Practical Application:
When leading an AI product development team, encourage a culture of continuous improvement through regular code tidyings. Recognize that time spent on tidying is an investment that pays off in the long run by making future development faster, smoother, and less error-prone. Emphasize the benefits of creating a more enjoyable and productive coding experience for the team.
Suggested Deep Dive
Chapter: Chapter 27: Options Versus Cash Flows
This chapter delves into the economic forces that influence when to tidy code. It explores the tension between the time value of money, which discourages upfront investment in design, and the concept of optionality, where upfront investment can create more valuable future choices. Understanding these principles is crucial for making informed decisions about when to tidy and how much design to do in advance, striking a balance between short-term gains and long-term flexibility.
Memorable Quotes
What is Tidy First? (p. 10)
Software design is an exercise in human relationships. In Tidy First? we start with the proverbial person in the mirror - with the programmer’s relationship with themself.
How I Came to Write Tidy First? (p. 14)
Hours later I looked up, absolutely enthralled. Here were Newton’s laws of motion, but for software design. It was all so clear when it came out. How did we as an industry forget that clarity?
One Pile (p. 40)
The biggest cost of code is the cost of reading and understanding it, not the cost of writing it.
Structure and Behavior (p. 77)
Options are the economic magic of software - especially the option to expand.
Reversible Structure Changes (p. 90)
There seems to be an idealistic form of geek thinking that holds that if only we made decisions better, we would never make mistakes.
Comparative Analysis
Unlike traditional software design books that focus on abstract principles or specific methodologies, “Tidy First?” offers a pragmatic and personalized approach, aligning with works like “Refactoring” by Martin Fowler, which emphasizes improving existing code, and “Working Effectively with Legacy Code” by Michael Feathers, which provides strategies for managing technical debt. However, “Tidy First?” goes beyond these by explicitly linking code tidying to economic principles like the time value of money and optionality, a perspective that distinguishes it from other works. It also shares similarities with “A Philosophy of Software Design” by John Ousterhout in its emphasis on minimizing coupling and maximizing cohesion, but “Tidy First?” advocates for a more iterative and incremental approach rather than striving for perfect upfront design. While agreeing with the importance of these principles, “Tidy First?” recognizes that the perfect design is often unknowable in advance and advocates for making design decisions based on observed needs and opportunities.
Reflection
“Tidy First?” offers a refreshing perspective on software design, grounded in practicality and economic principles. It challenges the notion of striving for the “perfect” design upfront and instead encourages an iterative approach that embraces change and uncertainty. While the book focuses primarily on the individual programmer, its insights can be extended to teams and organizations. By fostering a culture of continuous improvement through tidying, teams can reduce the cost and complexity of software development, leading to faster delivery, higher quality, and a more positive work environment. However, some might argue that the emphasis on tidying could lead to excessive focus on code structure at the expense of delivering user value. It’s crucial to remember that tidying is a means to an end, not an end in itself. The ultimate goal is to create software that meets user needs, and tidying should be done strategically to support this goal. Overall, “Tidy First?” is a valuable contribution to the field of software design, providing practical advice and a theoretical framework that can help developers make more informed design decisions, leading to more valuable and sustainable software.
Flashcards
What are “tidyings” in the context of Tidy First?
Tiny, incremental design improvements that make code easier to understand and change.
What is dead code, and what should you do with it?
Code that never gets executed. It should be deleted to enhance readability.
What does it mean to “Normalize Symmetries”?
Making identical code look identical and different code look different to improve consistency and readability.
What is a guard clause?
A conditional statement at the beginning of a function that checks for specific conditions and returns early if those conditions are met.
What is the purpose of extracting helper functions?
Extracting blocks of code into separate functions with clear names and purposes to improve reusability and readability.
What is coupling in software design?
The interconnectedness of elements in a system. High coupling makes changes expensive and risky.
What is cohesion in software design?
Grouping related elements together within a containing element. It improves code organization and understanding.
What does Constantine’s Equivalence state?
The cost of software is approximately equal to the cost of changing it. Reducing coupling is crucial to minimize cost.
What is the time value of money?
The principle that a dollar today is worth more than a dollar tomorrow, so it pays to earn sooner and spend later.
What is optionality in software design?
Creating a system that is flexible and easy to change, allowing for future behavioral changes without incurring high costs.